1 A real-time integration of concept-based search and summarization on Chinese
websites 



Joe F. Zhou, Weiquan Liu 

October 2000 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods 
in natural language processing and very large corpora: held in 
conjunction with the 38th Annual Meeting of the Association for 
Computational Linguistics - Volume 13 

Publisher: Association for Computational Linguistics 

Full text available: pdf(685.02 KB) Additional Information: full citation, abstract, references

This paper introduces an intuitive search environment for casual and novice Chinese users
over the Internet. The system consists of four components: a concept network, a query
reformulation model, a standard search engine, and an automatic summarizer. When the 
user enters one or more fairly general and vague terms, the search engine returns an 
initial answer set, and at the same time pipes the query to the concept network that 
connects thousands of conceptual nodes, each referring to a specific cone ... 

2 Challenges in information retrieval and language modeling: report of a workshop held
at the center for intelligent information retrieval, University of Massachusetts 
Amherst, September 2002 

James Allan, Jay Aslam, Nicholas Belkin, Chris Buckley, Jamie Callan, Bruce Croft, Sue 
Dumais, Norbert Fuhr, Donna Harman, David J. Harper, Djoerd Hiemstra, Thomas Hofmann, 
Eduard Hovy, Wessel Kraaij, John Lafferty, Victor Lavrenko, David Lewis, Liz Liddy, R. 
Manmatha, Andrew McCallum, Jay Ponte, John Prager, Dragomir Radev, Philip Resnik, 
Stephen Robertson, Roni Rosenfeld, Salim Roukos, Mark Sanderson, Rich Schwartz, Amit 
Singhal, Alan Smeaton, Howard Turtle, Ellen Voorhees, Ralph Weischedel, Jinxi Xu, 
ChengXiang Zhai 

April 2003 ACM SIGIR Forum, Volume 37 Issue 1 
Publisher: ACM Press 

Full text available: pdf(1.60 MB) Additional Information: full citation, citings, index terms, review

July 1991 ACM Transactions on Information Systems (TOIS), Volume 9 Issue 3



Keywords: Expert Systems, full-text information retrieval, online search assistance, 
query reformulation, textbases 



4 Research track paper: Query chains: learning to rank from implicit feedback
Filip Radlinski, Thorsten Joachims 

August 2005 Proceeding of the eleventh ACM SIGKDD international conference on 
Knowledge discovery in data mining KDD '05 

Publisher: ACM Press 

Full text available: pdf(572.84 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents a novel approach for using clickthrough data to learn ranked retrieval 
functions for web search results. We observe that users searching the web often perform 
a sequence, or chain, of queries with a similar information need. Using query chains, we 
generate new types of preference judgments from search engine logs, thus taking 
advantage of user intelligence in reformulating queries. To validate our method we 
perform a controlled user study comparing generated preference judgme ... 

Keywords: clickthrough data, implicit feedback, machine learning, search engines, 
support vector machines 



5 Retrieving software objects in an example-based programming environment
Scott Henninger 

September 1991 Proceedings of the 14th annual international ACM SIGIR conference 
on Research and development in information retrieval SIGIR '91 

Publisher: ACM Press 

Full text available: pdf(1.19 MB) Additional Information: full citation, references, citings, index terms

6 Web mining with search engines: Generating query substitutions
Rosie Jones, Benjamin Rey, Omid Madani, Wiley Greiner

May 2006 Proceedings of the 15th international conference on World Wide Web
WWW '06

Publisher: ACM Press 

Full text available: pdf(305.01 KB) Additional Information: full citation, abstract, references, index terms

We introduce the notion of query substitution, that is, generating a new query to replace 
a user's original search query. Our technique uses modifications based on typical 
substitutions web searchers make to their queries. In this way the new query is strongly 
related to the original query, containing terms closely related to all of the original terms. 
This contrasts with query expansion through pseudo-relevance feedback, which is costly 
and can lead to query drift. This also contrasts with quer ... 

Keywords: paraphrasing, query rewriting, query substitution, sponsored search 



7 Web: Building a web thesaurus from web link structure 

Zheng Chen, Shengping Liu, Liu Wenyin, Geguang Pu, Wei-Ying Ma 
July 2003 Proceedings of the 26th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '03
Publisher: ACM Press 

Full text available: pdf(292.05 KB) Additional Information: full citation, abstract, references, citings, index terms

Thesaurus has been widely used in many applications, including information retrieval, 
natural language processing, and question answering. In this paper, we propose a novel 
approach to automatically constructing a domain-specific thesaurus from the Web using 
link structure information. The proposed approach is able to identify new terms and 
reflect the latest relationship between terms as the Web evolves. First, a set of high 
quality and representative websites of a specific domain is selected. ...

Keywords: content structure, link analysis, query expansion, thesaurus 



8 Learning search engine specific query transformations for question answering
Eugene Agichtein, Steve Lawrence, Luis Gravano 

April 2001 Proceedings of the 10th international conference on World Wide Web 
WWW '01 

Publisher: ACM Press 

Full text available: pdf(205.68 KB) Additional Information: full citation, references, citings, index terms



Keywords: information retrieval, query expansion, question answering, web search 



9 Relevance feedback: Using web-graph distance for relevance feedback in web 
search 

Sergei Vassilvitskii, Eric Brill 

August 2006 Proceedings of the 29th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '06 

Publisher: ACM Press 

Full text available: pdf(885.86 KB) Additional Information: full citation, abstract, references, index terms

We study the effect of user supplied relevance feedback in improving web search results. 
Rather than using query refinement or document similarity measures to rerank results, 
we show that the web-graph distance between two documents is a robust measure of 
their relative relevancy. We demonstrate how the use of this metric can improve the 
rankings of result URLs, even when the user only rates one document in the dataset. Our 
research suggests that such interactive systems can significantly improv ... 

Keywords: link analysis, relevance feedback, web search 



10 Evaluation of an expert system for searching in full text 
S. Gauch 

December 1989 Proceedings of the 13th annual international ACM SIGIR conference
on Research and development in information retrieval SIGIR '90
Publisher: ACM Press 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents a prototype expert system which provides online search assistance. 
The expert system automatically reformulates queries, using an online thesaurus as the 
source of domain knowledge, and a knowledge base of domain-independent search 
tactics. The expert system works with a full-text database which requires no syntactic or 
semantic pre-processing. In addition, the expert system ranks the retrieved passages in
decreasing order of probable relevance. Users' search ...

11 An evolutionary approach to constructing effective software reuse repositories 
Scott Henninger 

April 1997 ACM Transactions on Software Engineering and Methodology (TOSEM),
Volume 6 Issue 2
Publisher: ACM Press 

Full text available: pdf(662.79 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Repositories for software reuse are faced with two interrelated problems: (1) acquiring
the knowledge to initially construct the repository and (2) modifying the repository to
meet the evolving and dynamic needs of software development organizations. Current
software repository methods rely heavily on classification, which exacerbates acquisition
and evolution problems by requiring costly classification and domain analysis efforts
before a repository can be used effectively. This article o ...

Keywords: component repositories, information retrieval, software reuse 



12 XML data management and P2P: SRI: exploiting semantic information for effective 
query routing in a PDMS 

Federica Mandreoli, Riccardo Martoglia, Simona Sassatelli, Wilma Penzo 
November 2006 Proceedings of the eighth ACM international workshop on Web
information and data management WIDM '06
Publisher: ACM Press 

Full text available: pdf(257.14 KB) Additional Information: full citation, abstract, references, index terms

The huge amount of data available from Internet information sources has focused much 
attention on the sharing of distributed information through Peer Data Management 
Systems (PDMSs). In a PDMS, peers have a schema on their local data, and they are 
related each other through semantic mappings that can be defined between their own 
schemas. Querying a PDMS means either flooding the network with messages to all peers 
or take advantage of a routing mechanism to reformulate a query only on the best ...

Keywords: data-sharing P2P systems, query routing, semantics 



13 A unified framework for semantics and feature based relevance feedback in image 
retrieval systems

Ye Lu, Chunhui Hu, Xingquan Zhu, HongJiang Zhang, Qiang Yang

October 2000 Proceedings of the eighth ACM international conference on Multimedia
MULTIMEDIA '00
Publisher: ACM Press 

Full text available: pdf(696.54 KB) Additional Information: full citation, abstract, references, citings, index terms

The relevance feedback approach to image retrieval is a powerful technique and has been 
an active research direction for the past few years. Various ad hoc parameter estimation 
techniques have been proposed for relevance feedback. In addition, methods that perform 
optimization on multi-level image content model have been formulated. However, these 
methods only perform relevance feedback on the low-level image features and fail to 
address the images' semantic content. In this paper, we propose ... 

Keywords: image retrieval, image semantics, multimedia database, relevance feedback

14 Articles: Toward an ontology-enhanced information filtering agent
Kwang Mong Sim 

March 2004 ACM SIGMOD Record, Volume 33 Issue 1
Publisher: ACM Press 

Full text available: pdf(240.41 KB) Additional Information: full citation, abstract, references

Whereas search engines assist users in locating initial information sources, often an 
overwhelmingly large number of URLs is returned, and the task of browsing websites rests
heavily on users. The contribution of this work is developing an information filtering agent 
(IFA) that assists users in identifying out-of-context web pages and rating the relevance 
of web pages. An IFA determines the relevance of web pages by adopting three 
heuristics: (i) detecting evidence phrase ... 

15 New search paradigms: Searching with context
Reiner Kraft, Chi Chao Chang, Farzin Maghoul, Ravi Kumar 

May 2006 Proceedings of the 15th international conference on World Wide Web 
WWW '06 

Publisher: ACM Press 

Full text available: pdf(191.91 KB) Additional Information: full citation, abstract, references, index terms

Contextual search refers to proactively capturing the information need of a user by 
automatically augmenting the user query with information extracted from the search 
context; for example, by using terms from the web page the user is currently browsing or 
a file the user is currently editing. We present three different algorithms to implement 
contextual search for the Web. The first, query rewriting (QR), augments each query
with appropriate terms from the search context and uses an off ... 

Keywords: contextual search, meta-search, rank aggregation, specialized search 
engines, web search 



16 Supporting the construction and evolution of component repositories 
Scott Henninger 

May 1996 Proceedings of the 18th international conference on Software engineering 
ICSE '96 

Publisher: IEEE Computer Society 

Full text available: pdf(1.08 MB) Additional Information: full citation, abstract, references, index terms
Publisher Site 

Repositories must be designed to meet the evolving and dynamic needs of software 
development organizations. Current software repository methods rely heavily on 
classification, which exacerbates acquisition and evolution problems by requiring costly 
classification and domain analysis efforts before a repository can be used effectively. This 
paper outlines an approach in which minimal initial structure is used to effectively find 
relevant software components while methods are employed to increment ... 

Keywords: CodeFinder, PEEL, classification, component repositories, domain analysis, 
minimal initial structure, retrieval system, reusable software artifacts, software 
development, software engineering, software repository methods, software reusability 



17 Structured document handling — a case for integrating databases and information 
retrieval

Klemens Böhm, Adrian Müller, Erich Neuhold

November 1994 Proceedings of the third international conference on Information and
knowledge management CIKM '94
Publisher: ACM Press

Full text available: pdf(1.01 MB) Additional Information: full citation, abstract, references, index terms

In this paper we discuss the structured multimedia documents that will be, or already are, 
to some degree the communication backbone of the so-called superhighways. It will be 
shown that storage and retrieval of such documents will best be handled by an integration 
of database and information retrieval technologies. We assume documents to be 
structured with the help of standards like SGML/HyTime and represented by the multitude 
of formats currently used for multimedia data. Starti ...

18 A survey of Web metrics 

Devanshu Dhyani, Wee Keong Ng, Sourav S. Bhowmick 
December 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 4

Publisher: ACM Press 

Full text available: pdf(289.28 KB) Additional Information: full citation, abstract, references, citings, index terms

The unabated growth and increasing significance of the World Wide Web has resulted in a 
flurry of research activity to improve its capacity for serving information more effectively. 
But at the heart of these efforts lie implicit assumptions about "quality" and "usefulness" 
of Web resources and services. This observation points towards measurements and 
models that quantify various attributes of web sites. The science of measuring all aspects 
of information, especially its storage and retrieval or ... 

Keywords: Information theoretic, PageRank, Web graph, Web metrics, Web page 
similarity, quality metrics 



19 Industrial and government applications track posters: Identifying "best bet" web 
search results by mining past user behavior 
Eugene Agichtein, Zijian Zheng 

August 2006 Proceedings of the 12th ACM SIGKDD international conference on
Knowledge discovery and data mining KDD '06
Publisher: ACM Press 

Full text available: pdf(829.11 KB) Additional Information: full citation, abstract, references, index terms

The top web search result is crucial for user satisfaction with the web search experience. 
We argue that the importance of the relevance at the top position necessitates special 
handling of the top web search result for some queries. We propose an effective approach 
of leveraging millions of past user interactions with a web search engine to automatically 
detect "best bet" top results preferred by majority of users. Interestingly, this problem 
can be more effectively addressed with classificatio ... 

Keywords: user behavior mining, web search ranking, web usage mining 



20 XIRQL: An XML query language based on information retrieval concepts 
Norbert Fuhr, Kai Großjohann

April 2004 ACM Transactions on Information Systems (TOIS), Volume 22 issue 2 
Publisher: ACM Press 

Full text available: pdf(281.91 KB) Additional Information: full citation, abstract, references, citings, index terms

XIRQL ("circle") is an XML query language that incorporates imprecision and vagueness 
for both structural and content-oriented query conditions. The corresponding uncertainty 
is handled by a consistent probabilistic model. The core features of XIRQL are (1) 
document ranking based on index term weighting, (2) specificity-oriented search for
retrieving the most relevant parts of documents, (3) datatypes with vague predicates for
dealing with specific types of content and (4) structural vagueness f ... 

Keywords: Path algebra, XML, XQuery, probabilistic retrieval, ranked retrieval, vague 
predicates